fix: do not inject code above import statements #965
+140
−5
ES6 Import Statement Handling Fix
Problem
Previously, the JavaScript rewriter in pywb inserted initialization code at the very beginning of JavaScript files via the `first_buff` mechanism. This broke ES6 modules, in which `import` statements must remain at the module's top level and are expected to precede any other code.
Example of the Problem:
Before the fix:
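The original example did not survive in this description, so the following is only a minimal Python sketch of the behavior described above. `FIRST_BUFF`, `naive_insert`, and the sample module source are hypothetical stand-ins, not pywb's actual injected code or API.

```python
# Hypothetical stand-in for the injected initialization code; the real
# contents of first_buff are not shown in this description.
FIRST_BUFF = "/* injected init code */\n"

# A small ES6 module whose imports sit at the very top of the file.
module_source = (
    'import { foo } from "./foo.js";\n'
    'import bar from "./bar.js";\n'
    "foo(bar);\n"
)


def naive_insert(first_buff: str, text: str) -> str:
    """Old behavior as described above: always prepend first_buff."""
    return first_buff + text


rewritten = naive_insert(FIRST_BUFF, module_source)
print(rewritten)
```

In the printed result the injected code sits above both `import` lines, which is the placement this PR sets out to avoid.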
Solution
Modified the `StreamingRewriter` class in `pywb/rewrite/content_rewriter.py` to detect ES6 `import` statements at the beginning of JavaScript files and insert the initialization code after all leading imports instead of before them.
Example After the Fix:
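A minimal sketch of the approach, assuming a single regex that consumes leading whitespace, comments, and `import` statements. `LEADING_IMPORTS_RE` and `insert_after_leading_imports` are hypothetical names; this is not the PR's actual `IMPORT_REGEX` or `_insert_with_import_check`, and the pattern only covers common static import forms.

```python
import re

# Hypothetical pattern (not the PR's IMPORT_REGEX): one or more import
# statements, each optionally preceded by whitespace and comments.
LEADING_IMPORTS_RE = re.compile(
    r"""
    (?:
        (?: \s | //[^\n]*\n | /\*.*?\*/ )*   # whitespace, // and /* */ comments
        import \s*                            # the import keyword
        (?: [^'"();]+? \s* from \s* )?        # optional bindings followed by "from"
        (?: "[^"\n]*" | '[^'\n]*' )           # quoted module specifier
        [ \t]* ;? [ \t]* \n?                  # optional semicolon and line end
    )+
    """,
    re.VERBOSE | re.DOTALL,
)


def insert_after_leading_imports(first_buff: str, text: str) -> str:
    """Insert first_buff after the leading imports, or prepend it if none."""
    match = LEADING_IMPORTS_RE.match(text)
    if match is None:
        # No leading imports: keep the original prepend behavior.
        return first_buff + text
    end = match.end()
    return text[:end] + first_buff + text[end:]


FIRST_BUFF = "/* injected init code */\n"  # hypothetical placeholder
source = (
    "// leading comment\n"
    'import { foo } from "./foo.js";\n'
    "import bar from './bar.js';\n"
    "foo(bar);\n"
)
result = insert_after_leading_imports(FIRST_BUFF, source)
print(result)
```

Here the placeholder lands between the second import and `foo(bar);`. Because the bindings part of the pattern excludes `(`, a dynamic `import("x")` call is not treated as a leading import and the code is prepended as before.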
Changes Made
1. Added ES6 Import Detection Regex
Added a regex pattern to the `StreamingRewriter` class that matches:
- leading comments (`//` and `/* */` style)
- `import` statements
2. New Method: `_insert_with_import_check`
Added a new method that:
- inserts `first_buff` after all imports when leading imports are found
- inserts `first_buff` at the beginning otherwise (original behavior)
3. Updated `rewrite_complete` Method
Modified to use the new `_insert_with_import_check` method for proper placement of injected code.
4. Updated `rewrite_text_stream_to_gen` Method
Enhanced the streaming version to handle ES6 imports.
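The details of the streaming change are not preserved here. One possible shape, sketched under stated assumptions: buffer the start of the stream up to a size cap, apply the import-aware insertion to that prefix, then pass the remaining chunks through untouched. `rewrite_stream`, `max_prefix`, and the line-based `insert_after_import_lines` helper are all hypothetical and are not taken from `rewrite_text_stream_to_gen`.

```python
from typing import Callable, Iterable, Iterator


def insert_after_import_lines(first_buff: str, text: str) -> str:
    """Simplified line-based stand-in for import detection.

    Skips blank lines and // comments, and treats any line starting with
    "import " as a leading import. Block comments are not handled.
    """
    lines = text.splitlines(keepends=True)
    end = 0  # number of leading lines to keep above first_buff
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("import "):
            end = i + 1
        elif stripped and not stripped.startswith("//"):
            break  # first real code line: stop scanning
    return "".join(lines[:end]) + first_buff + "".join(lines[end:])


def rewrite_stream(
    chunks: Iterable[str],
    first_buff: str,
    insert_fn: Callable[[str, str], str],
    max_prefix: int = 8192,  # hypothetical cap on how much text is buffered
) -> Iterator[str]:
    """Buffer the head of the stream, insert first_buff, stream the rest."""
    it = iter(chunks)
    buffered = []
    size = 0
    for chunk in it:
        buffered.append(chunk)
        size += len(chunk)
        if size >= max_prefix:
            break
    # Decide the insertion point once, on the buffered prefix only.
    yield insert_fn(first_buff, "".join(buffered))
    # Everything after the prefix is passed through unchanged.
    yield from it


chunks = ['import a from "./a.js";\nimp', 'ort b from "./b.js";\n', "a(b);\n"]
out = "".join(rewrite_stream(chunks, "/* injected */\n", insert_after_import_lines))
print(out)
```

A known limitation of this sketch: if the import block extends past `max_prefix`, or the cap cuts an `import` line in half, later imports end up below the injected code. A real implementation would need to handle that boundary case.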
5. Added Test Cases
Added comprehensive test cases in `pywb/rewrite/test/test_content_rewriter.py`:
- `test_es6_imports_insertion_after_imports`: Basic ES6 import handling
- `test_es6_imports_with_comments`: Imports with leading comments
- `test_no_es6_imports_normal_insertion`: Backward compatibility
Behavior Summary
Files Modified
`pywb/rewrite/content_rewriter.py`
- Added `IMPORT_REGEX` class variable
- Added `_insert_with_import_check()` method
- Updated `rewrite_complete()` to use new method
- Updated `rewrite_text_stream_to_gen()` to handle streaming with imports
`pywb/rewrite/test/test_content_rewriter.py`
- `test_es6_imports_insertion_after_imports()`
- `test_es6_imports_with_comments()`
- `test_no_es6_imports_normal_insertion()`
Testing
All test scenarios pass.
Impact
With this fix, pywb no longer places injected code above the leading `import` statements of ES6 modules, so rewritten modules keep their imports first. Files without leading imports are rewritten as before.
Fix #964